A major goal for fault-tolerant quantum computation (FTQC) is to reduce the overhead needed for error
correction. One approach is to use block codes that encode multiple qubits, which can achieve significantly
higher rates for the same code distance than single-qubit code blocks or topological codes. We present a scheme
for universal quantum computation using multi-qubit Calderbank-Shor-Steane (CSS) block codes, where codes
admitting different transversal gates are used to achieve universality, and logical teleportation is used to move
qubits between code blocks. All circuits for both computation and error correction are transversal. We also
argue that single-shot fault-tolerant error correction can be done in Steane syndrome extraction. Then, we
present estimates of information lifetime for a few possible codes, which suggest that highly nontrivial quantum
computations can be achieved at reasonable error rates, using codes that require significantly fewer than 100
physical qubits per logical qubit.

PACS numbers: 03.67.Lx, 03.67.Pp 


Quantum computers (QCs) are extremely vulnerable to errors during the computation process.
Theory has shown that if errors of each type are sufficiently local, and their rates are small
enough to fall below a threshold, it is possible to carry out quantum computations of arbitrary
size with arbitrarily small error, by so-called fault-tolerant methods using quantum
error-correcting codes (QECCs). Since the threshold theorem was established, a number of
fault-tolerant schemes have been proposed.

From the practical viewpoint, we say that an FTQC scheme is fault-tolerant if the logical error
rates for each elementary logical operation in the scheme are sufficiently low that a quantum
algorithm can be executed with a high probability of success. Typically, a practical quantum
algorithm using $K$ qubits and $Q$ elementary steps has $KQ \gg 10^{10}$, so the logical error
rate for each logical operation should be much less than $10^{-10}$. However, most FTQC schemes
require enormous overhead to achieve this rate, by increasing either the number of concatenation
levels for concatenated codes or the distance (and hence the size) for topological codes. As a
result, a logical qubit is encoded in thousands of physical qubits.

It was observed by Steane more than a decade ago that multi-qubit block codes can achieve
significantly higher code rates for comparable error protection ability, but logical gates in
these codes are quite difficult to implement. In this paper, we propose a scheme that exploits
the advantages of multi-qubit block codes. Similar to the von Neumann architecture, our scheme
has three components, as shown in Fig. 1: a memory array of $[[n, k, d]]$ CSS code blocks
($C_m$) with $k \gg 1$; a processor array of $[[n', 1, d']]$ quantum code blocks ($C_p$) that
support a transversal $T$ ($\pi/8$) gate (or other non-Clifford gate); and an ancilla factory
that continuously produces a variety of fresh logical ancillas for error correction,
teleportation, and logical operator measurement. Another feature of our scheme is that magic
state distillation, which usually dominates the overhead of an FTQC scheme, is not required,
as in some earlier proposals.

FIG. 1. The architecture of our teleportation-based FTQC scheme.

Universal FTQC - Quantum information is stored in the memory array, and error correction using
Steane syndrome extraction [20] is performed constantly. Logical Clifford operations can be
implemented by measuring sequences of logical operators on the $C_m$ code blocks in the memory
array. We will show that measuring logical Pauli operators of $C_m$ can be combined with error
correction if certain ancilla states are available. To implement a logical $T$ gate on a
particular logical qubit, that logical qubit is teleported to a $C_p$ code block, where a
transversal $T$ gate is performed, and then teleported back to its original memory block. Error
correction can be performed between these operations if necessary. Again, provided with suitable
ancilla states, it is possible to simultaneously measure the logical operators and implement
logical teleportation between $C_m$ and $C_p$ code blocks. Thus, universal quantum computation
is achieved. Moreover, this scheme needs only ancilla preparation, transversal circuits, and
single-qubit measurements, and thus it is intrinsically fault-tolerant. Details of these
logical operations are given below.

It is evident that a large number of clean ancillas of various types are required in this
scheme. Fortunately, these logical ancillas are stabilizer states. They can be prepared by
quantum circuits of Clifford gates only and then distilled. The distillation procedure is more
like entanglement distillation than magic state distillation, and has some advantages. Magic
state distillation has a probability of failure, in which case everything has to be discarded,
and it may need several iterations; stabilizer states, by contrast, can be measured, and logical
errors, if detected, can be corrected. We will not go through the details of this logical
ancilla distillation in this paper, but simply assume that we have an ancilla factory capable
of preparing all the ancillas with high fidelity.

Steane Syndrome Extraction - First let us briefly review Steane syndrome extraction, which
leads to the other operations. It is used to measure the stabilizer generators for the
$[[n, k, d]]$ CSS code $C_m$ (similarly for $C_p$ if necessary) in our scheme. The procedure is
as follows: 1) Prepare two ancilla states in the same code $C_m$ (or $C_p$), where all logical
qubits are set to the states $|0\rangle_L^{\otimes k}$ and $|+\rangle_L^{\otimes k}$ (called the
$X$ and $Z$ ancillas, respectively), for $Z$ and $X$ error syndrome measurements, respectively.
2) Perform a transversal CNOT from the information block to the $Z$ ancilla. 3) Perform a
transversal CNOT from the $X$ ancilla to the information block. 4) Do single-qubit measurements
on the $X$ and $Z$ ancilla qubits in the $X$ and $Z$ basis, respectively. Collecting the
measurement outcomes and multiplying the correct subset of $\pm 1$ results together reveals the
eigenvalue of each stabilizer generator, and hence the error syndrome. This procedure is shown
as the left circuit in Fig. 2.

FIG. 2. The circuit for Steane syndrome extraction and its effective error model.
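The classical post-processing in step 4 can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the parity checks of the [7,4,3] Hamming code (the classical code behind the [[7,1,3]] Steane code, which is not one of the codes studied in this paper), and the names `H`, `syndrome` and `locate_single_error` are hypothetical.

```python
# Parity-check matrix of the classical [7,4,3] Hamming code (assumed example).
# Column j (1-based) is the binary representation of j, least significant bit in row 0.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def syndrome(outcomes):
    """Multiply the +/-1 single-qubit outcomes in the support of each generator."""
    result = []
    for row in H:
        eigenvalue = 1
        for in_support, m in zip(row, outcomes):
            if in_support:
                eigenvalue *= m
        result.append(eigenvalue)
    return result

def locate_single_error(synd):
    """For this Hamming code, the -1 entries spell the (1-based) index of a
    single flipped qubit in binary; 0 means no error was detected."""
    return sum(1 << i for i, s in enumerate(synd) if s == -1)

# Outcomes for the Hamming codeword 1110000 with qubit 5 flipped (bit b -> (-1)**b).
outcomes = [-1, -1, -1, 1, -1, 1, 1]
print(syndrome(outcomes), locate_single_error(syndrome(outcomes)))
```

The individual outcomes are random (they sample a codeword of the classical code), but the products over each generator's support are deterministic up to errors, which is why only the parities carry syndrome information.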

Measuring logical operators - Steane has shown that measuring a logical $X^u$ or $Z^u$ can be
combined with the recovery operation in Steane syndrome extraction, where $u$ is a binary
$k$-tuple indicating which logical qubits are operated on. If logical $X^u$ is to be measured,
the prepared $X$ ancilla $|0\rangle_L$ at step 1) is replaced with $|0\rangle_L + |u\rangle_L$,
which is also a stabilizer state, and the rest of the steps are the same as before. As an
example, suppose we wish to measure $X_iX_j$. Then the logical qubits $i$ and $j$ of the $X$
ancilla are prepared in the state
$|\Phi^+\rangle_L = \frac{1}{\sqrt{2}}(|0_i0_j\rangle_L + |1_i1_j\rangle_L)$, which is a joint
$+1$ eigenstate of $X_iX_j$ and $Z_iZ_j$, and the other logical qubits are prepared in the state
$|0\rangle_L$. This logical operator measurement is protected by the classical error-correcting
code from which $C_m$ (or $C_p$) is built.

Next we generalize this method to products of logical $X$ and $Z$ operators. To illustrate, we
show how to measure logical operators of the form $X_iZ_j$ on logical qubits $i$ and $j$. This
will allow us to do any Clifford gate, as we shall see. In this case, if $i \neq j$, the $X$
and $Z$ ancillas at step 1) are prepared in a particular entangled logical state: logical qubit
$i$ of the $X$ ancilla and logical qubit $j$ of the $Z$ ancilla are prepared in the entangled
state $|\Omega_{ij}\rangle_L = \frac{1}{2}(|0_i0_j\rangle_L + |0_i1_j\rangle_L + |1_i0_j\rangle_L - |1_i1_j\rangle_L)$,
which is the joint $+1$ eigenstate of $X_iZ_j$ and $Z_iX_j$, while the other logical qubits of
the $X$ or $Z$ ancillas are prepared in the state $|0\rangle_L$ or $|+\rangle_L$, respectively.
If $i = j$, we need to prepare the ancilla as a joint $+1$ eigenstate of $Y_iZ_i$ and $Z_iX_i$.
Again, these joint $2n$-qubit states are stabilizer states. It is similarly possible to measure
any logical operator $X^uZ^v$ by preparing more complicated ancilla states. It is important to
emphasize that these measurements are combined with error correction, as in the original Steane
syndrome extraction.

Logical teleportation - We show that logical qubits can be teleported between arbitrary code
blocks (of $C_m$ or $C_p$). To perform a non-Clifford quantum gate on a logical qubit (or
qubits) of a $C_m$ memory block, the target logical qubit is teleported to a $C_p$ processor
block that allows the non-Clifford gate to be implemented transversally. One can think of the
two code blocks as parts of a larger code. Suppose the logical qubits of $C_m$ have associated
with them pairs of logical operators $(X_1, Z_1), (X_2, Z_2), \ldots, (X_k, Z_k)$, which are all
Pauli operators. We reserve logical qubit 1 as a buffer qubit used in teleporting qubits.
Suppose the logical operators of the $[[n', 1, d']]$ code $C_p$ are labeled $(X_0, Z_0)$. Here
is the procedure to teleport logical qubit $j$ from the storage block to the processing block:
1) Measure the operators $X_0X_1$ and $Z_0Z_1$. This prepares a logical Bell state between the
processor block and the buffer qubit. 2) Measure the operators $X_1X_j$ and $Z_1Z_j$. This does
a logical Bell measurement on the buffer qubit and qubit $j$, and teleports qubit $j$ to the
processor block. 3) If necessary, apply a logical Pauli operator to the processor block to
correct the state. One would generally need to do this correction before applying the
non-Clifford gate. 4) Apply the non-Clifford gate by a transversal circuit on the processor
block. 5) Measure the operators $X_0X_1$ and $Z_0Z_1$. This performs a logical Bell measurement
on the processor block and the buffer qubit, and teleports the transformed qubit back to logical
qubit $j$ of the memory block. 6) If desired, apply a logical Pauli operator to the memory block
to correct. The procedure is illustrated in Fig. 3 for a logical $T$ gate. The steps of
measuring logical operators are similar to what was described previously, except that the $X$
and $Z$ ancillas are prepared in a logical entangled state between $C_m$ and $C_p$. Logical
teleportation can also be applied, of course, to move logical qubits between memory blocks.

FIG. 3. (Color online) Diagram of the logical $T$ gate using logical state teleportation. The
red blocks represent joint measurements of logical qubits, and the blue one represents bitwise
$T$ or $T^\dagger$ applied to the processor block.

Logical Clifford gates - Clifford gates can be performed within a memory block solely by
measuring logical operators, which is a kind of simplified logical teleportation. We demonstrate
how to perform a logical Hadamard gate and a logical CNOT gate, while the Phase gate and the
SWAP gate are left to the supplementary material. Suppose we wish to perform a logical Hadamard
gate on logical qubit $i$ of a $C_m$ block in the state $|\psi\rangle_L$. Logical qubit 1 is
reserved as a buffer qubit in the state $|0\rangle_L$. We do the following two measurements:
1) measure $X_1Z_i$; 2) measure $X_i$. Logical qubit 1 will be left in the state
$H|\psi\rangle_L$, up to a Pauli correction $X_1$, $Y_1$ or $Z_1$ on logical qubit 1, according
to the measurement outcomes. Logical qubit $i$ will be left in the state $|+\rangle_L$ or
$|-\rangle_L$ and can be reset to $|0\rangle_L$ as a new buffer qubit, if desired, by measuring
$Z_i$ and applying an $X_i$ correction if necessary. A CNOT can be performed similarly. Suppose
we wish to perform a CNOT from logical qubit $i$ to logical qubit $j$ of a $C_m$ block, and
again, suppose logical qubit 1 is a buffer qubit in state $|0\rangle_L$. Here is the procedure:
1) Measure $X_1X_j$; 2) Measure $Z_iZ_j$; 3) Measure $X_i$. This does a CNOT from logical qubit
$i$ to logical qubit $j$ (up to a Pauli correction), shifts them to logical qubits $j$ and 1,
respectively, and moves the buffer qubit to qubit $i$ in the state $|\pm\rangle_L$. We can build
any Clifford unitary from Hadamard, Phase, CNOT, and SWAP gates. But a complicated Clifford
unitary can also be done directly by measuring more complicated combinations of logical
operators. Each of these measurements requires the preparation of a particular ancilla state.
Therefore a tradeoff exists between the efficiency of enabling a larger set of possible Clifford
operations and the complexity of having to prepare and distribute more kinds of ancillas.
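The measurement-based Hadamard described above can be checked on bare (unencoded) qubits with a small state-vector calculation. This is only a sketch under assumptions that are not part of the scheme itself: two physical qubits stand in for the logical buffer and data qubits, measurement outcomes are imposed by projection rather than sampled, and the correction rule (a $Z$ on the buffer for outcome $-1$ of the first measurement, an $X$ for outcome $-1$ of the second) is the one that follows from this toy calculation. All function names are hypothetical.

```python
import math

def apply_pauli(state, pauli, qubit):
    """Apply 'X' or 'Z' to one qubit of a 2-qubit state.
    Index convention: idx = 2 * buffer_bit + data_bit; qubit 0 = buffer, 1 = data."""
    shift = 1 - qubit
    out = [0j] * 4
    for idx, amp in enumerate(state):
        if pauli == "X":
            out[idx ^ (1 << shift)] += amp
        else:  # "Z": sign flip on the |1> component
            out[idx] += -amp if (idx >> shift) & 1 else amp
    return out

def project(state, ops, outcome):
    """Project onto the `outcome` (+1 or -1) eigenspace of a product of Paulis."""
    flipped = state
    for pauli, qubit in ops:
        flipped = apply_pauli(flipped, pauli, qubit)
    new = [(a + outcome * b) / 2 for a, b in zip(state, flipped)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in new))
    return [x / norm for x in new]

def hadamard_by_measurement(a, c, m1, m2):
    """Start from |0>_buffer (a|0> + c|1>)_data, measure X_buffer Z_data, then X_data."""
    state = [a, c, 0j, 0j]
    state = project(state, [("X", 0), ("Z", 1)], m1)
    state = project(state, [("X", 1)], m2)
    if m1 == -1:
        state = apply_pauli(state, "Z", 0)  # Pauli correction on the buffer
    if m2 == -1:
        state = apply_pauli(state, "X", 0)
    return state

def hadamard_fidelity(a, c, m1, m2):
    """Overlap of the corrected state with (H|psi>)_buffer |+/->_data."""
    state = hadamard_by_measurement(a, c, m1, m2)
    s = 1 / math.sqrt(2)
    buffer_target = [(a + c) * s, (a - c) * s]
    data_target = [s, m2 * s]
    expected = [b * d for b in buffer_target for d in data_target]
    overlap = sum(e.conjugate() * x for e, x in zip(expected, state))
    return abs(overlap) ** 2

for m1 in (1, -1):
    for m2 in (1, -1):
        print(m1, m2, round(hadamard_fidelity(0.6, 0.8j, m1, m2), 9))
```

For every pair of outcomes the buffer ends in $H|\psi\rangle$ after the Pauli correction, and the data qubit is left in $|+\rangle$ or $|-\rangle$, in line with the description above.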

Note also that it is not necessary to apply the Pauli corrections; we can just keep track of
them and how they are transformed by Clifford unitaries (the 'Pauli frame'). But if we do wish
to correct, that is a transversal operation as well.
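Pauli-frame tracking amounts to simple bookkeeping on two bits per qubit. The sketch below is an assumption-laden illustration rather than the paper's implementation: it ignores overall phases, uses the standard conjugation rules for H, S and CNOT, and the class name `PauliFrame` and its methods are hypothetical.

```python
class PauliFrame:
    """Track pending Pauli corrections as (x, z) bits per logical qubit, up to phase."""

    def __init__(self, n):
        self.x = [0] * n
        self.z = [0] * n

    def record(self, pauli, q):
        """Record a correction 'X', 'Y' or 'Z' on qubit q instead of applying it."""
        if pauli in ("X", "Y"):
            self.x[q] ^= 1
        if pauli in ("Z", "Y"):
            self.z[q] ^= 1

    def h(self, q):
        """Hadamard exchanges X and Z."""
        self.x[q], self.z[q] = self.z[q], self.x[q]

    def s(self, q):
        """Phase gate maps X to Y (up to phase) and leaves Z unchanged."""
        self.z[q] ^= self.x[q]

    def cnot(self, control, target):
        """CNOT copies X from control to target and Z from target to control."""
        self.x[target] ^= self.x[control]
        self.z[control] ^= self.z[target]

    def label(self, q):
        return "IXZY"[self.x[q] + 2 * self.z[q]]

frame = PauliFrame(2)
frame.record("X", 0)
frame.cnot(0, 1)
print(frame.label(0), frame.label(1))
```

Only when a non-Clifford gate is about to be applied does the accumulated frame generally have to be turned into a physical (transversal) correction, as noted in step 3 of the teleportation procedure.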

Error model - Steane syndrome extraction and its variations are used throughout the scheme.
There are at least four kinds of errors in this scheme: memory errors in the code blocks,
physical gate errors, faulty ancilla preparation, and measurement errors. We model errors in
physical gates, ancilla preparations, and measurements by treating them as perfect operations
followed or preceded by Pauli errors. In this paper we represent the physical noise model as
depolarizing errors. At each time step, every physical qubit independently undergoes a Pauli
error $X$, $Y$ or $Z$, each with probability $\epsilon/3$, or remains unchanged with probability
$1 - \epsilon$, where $\epsilon$ is called the memory error rate. We treat ancilla preparation
as perfect, followed by each qubit of an ancilla block independently suffering the same kind of
depolarizing error with rate $r$. Similarly, we treat each single-qubit gate as perfect,
followed by a depolarizing error with rate $p_{g_1}$. A two-qubit gate is modeled as a perfect
gate followed by one of the 15 possible single- or two-qubit errors IX, IY, IZ, XI, XX, XY, XZ,
YI, YX, YY, YZ, ZI, ZX, ZY, and ZZ, each with probability $p_{g_2}/15$, or no error with
probability $1 - p_{g_2}$. Finally, the measurement of a single physical qubit has a classical
bit-flip error with probability $p_m$ (or equivalently, an $X$ or $Z$ error preceding a
measurement in the $Z$ or $X$ basis, respectively). Note that we do not expect the form of the
errors to greatly affect the performance, but the assumption of independence across qubits is
very important.
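For readers who want to reproduce this kind of noise in a Monte-Carlo simulation, the sampling rules above translate directly into code. The following is a minimal sketch, not the simulation code used for the results in this paper; the function names are hypothetical and errors are represented simply as Pauli labels.

```python
import random

# The 15 non-identity two-qubit Pauli labels: IX, IY, ..., ZZ.
TWO_QUBIT_ERRORS = [a + b for a in "IXYZ" for b in "IXYZ" if a + b != "II"]

def sample_depolarizing(rate, rng):
    """'I' with probability 1 - rate, otherwise X, Y or Z with probability rate/3 each.
    Used for memory errors (rate epsilon), ancilla preparation (rate r)
    and single-qubit gates (rate p_g1)."""
    return rng.choice("XYZ") if rng.random() < rate else "I"

def sample_two_qubit_gate_error(p_g2, rng):
    """'II' with probability 1 - p_g2, otherwise one of the 15 errors, equally likely."""
    return rng.choice(TWO_QUBIT_ERRORS) if rng.random() < p_g2 else "II"

def sample_measurement_flip(p_m, rng):
    """True if the classical measurement outcome is flipped."""
    return rng.random() < p_m

def sample_block_errors(n, rate, rng):
    """Independent depolarizing errors on each of the n physical qubits of a block."""
    return [sample_depolarizing(rate, rng) for _ in range(n)]

rng = random.Random(2024)
print(sample_block_errors(23, 0.05, rng))
```

Each qubit is sampled independently, which is the property the text singles out as essential.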

The measurement outcomes in Steane syndrome extraction can be erroneous due to either imperfect
measurements or errors during the circuit. This makes error analysis difficult. Traditionally
this is handled by repeated syndrome measurements. Herein we show that syndrome measurement can
be done in a single shot. Actually, all errors during the syndrome extraction process can be
mapped to errors occurring on the data qubits before and after the process, so that the
ancillas, gates and qubit measurements in the circuit can be regarded as error-free, as
illustrated in Fig. 2. This is formally stated as the following theorem.

Theorem 1. (Effective error) During the process of imperfect 
Steane syndrome extraction and its variations, if errors in the 
same block (memory, processor, or ancilla) are uncorrelated, 
then the errors are equivalent to effective errors acting only 
on the data qubits before and after the process. 

The idea is to commute errors forward or backward in the circuit. This theorem is applicable to
quite general independent noise models and not just the depolarizing channel we focus on in this
paper. Consequently we only have to decode the effective error on the data qubits at each
syndrome measurement. Since syndromes are measured at every step of the process, this avoids
repeated syndrome measurement and greatly reduces the time overhead for syndrome extraction and
for implementing logical gates. On the other hand, it can also eliminate potential errors caused
by repeated measurements. The estimated effective error will be a Pauli operator and can either
be corrected instantly, or kept track of in the cumulative Pauli frame on the data qubits. After
every round, there will generally be a residual error that has not yet been detected, and that
differs from the current error estimate. However, this is not a problem so long as the weight of
this residual error is always small compared to the distance of the code. At the same time, the
theorem greatly simplifies the analysis of error propagation.

For example, if we assume $\epsilon = p_m = r = p_{g_1} = p_{g_2} = p$, the effective error
model on the data block at each time step can be approximated by

$$\mathcal{E}_{\rm tot}[\rho] \approx (1 - 11p)\rho + \frac{71}{15}p\,X\rho X + \frac{71}{15}p\,Z\rho Z + \frac{23}{15}p\,Y\rho Y. \qquad (1)$$

(Details of this approximation and the proof of Theorem 1 are given in the supplementary
materials.) For simplicity, we choose the error model to be a depolarizing error with rate
$p_{\rm eff} = (71/5)p$ in the following simulation, so that each Pauli error occurs with
probability $p_{\rm eff}/3 = (71/15)p$, where $p$ is the underlying physical error rate.
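The arithmetic linking Eq. (1), the effective rate and the physical rate is easy to check with exact fractions. The snippet below is a small sketch; reading $p_{\rm eff}$ as three times the largest single-Pauli coefficient of Eq. (1) is our interpretation of the simplification, and the names `COEFFS` and `p_eff` are hypothetical.

```python
from fractions import Fraction

# First-order coefficients of p in Eq. (1).
COEFFS = {"X": Fraction(71, 15), "Z": Fraction(71, 15), "Y": Fraction(23, 15)}

def p_eff(p):
    """Depolarizing rate whose per-Pauli probability equals the largest
    coefficient in Eq. (1) times p (a pessimistic, symmetric approximation)."""
    return 3 * max(COEFFS.values()) * p

print(sum(COEFFS.values()))               # total first-order error weight, in units of p
print(float(p_eff(Fraction(5, 10000))))   # effective rate for p = 5e-4
```

With $p = 5 \times 10^{-4}$ this gives $p_{\rm eff} \approx 0.007$, the operating point quoted in the estimates that follow.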

Estimate of the logical error rate - The error rate for each logical step is determined by the
failure rate of decoding for the effective error process. The performance and the physical
resources of our scheme will depend heavily on the choice of quantum codes for the memory and
processor blocks. For memory blocks, desirable properties include: 1) high distance, 2) good
code rate, and 3) an efficient decoding algorithm. As preliminary research, we study three large
block codes obtained by concatenating a medium-sized block code with a high-distance
single-qubit code. By concatenating the [[89,23,9]], [[127,57,11]], and [[255,143,15]] quantum
BCH codes with the [[23,1,7]] quantum Golay code, we obtain CSS codes with parameters
[[2047,23,63]], [[2921,57,77]] and [[5865,143,105]], respectively. All three block codes on
average encode a single logical qubit in fewer than 100 physical qubits. The Golay code is
decoded using the Kasami error-trapping decoder [25], and the BCH codes are decoded using the
Berlekamp-Massey algorithm. Fig. 4 shows the logical error rate for a memory block obtained by
Monte-Carlo simulation. However,
the simulation complexity is too high to go beyond peff ^ 10"^, 
even with the Titan supercomputing resource. Thus, we use 
linear extrapolation to estimate that region, and find that at 
$p_{\rm eff} = 0.007$ (corresponding to $p = 5 \times 10^{-4}$) the logical error
rates are less than 10“^® for all the three codes. 
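The parameters of the three memory codes follow from the usual concatenation rule: block lengths and distances multiply when the inner code encodes a single qubit. The helper below is a small sketch to make that bookkeeping explicit; the function name `concatenate` is hypothetical.

```python
def concatenate(outer, inner):
    """Parameters [[n, k, d]] of an outer code concatenated with a k = 1 inner code."""
    n_out, k_out, d_out = outer
    n_in, k_in, d_in = inner
    if k_in != 1:
        raise ValueError("this simple rule assumes a single-qubit inner code")
    return (n_out * n_in, k_out, d_out * d_in)

GOLAY = (23, 1, 7)
for bch in [(89, 23, 9), (127, 57, 11), (255, 143, 15)]:
    n, k, d = concatenate(bch, GOLAY)
    # Physical qubits per logical qubit, not counting ancilla blocks.
    print((n, k, d), round(n / k, 1))
```

The ratio $n/k$ comes out at roughly 89, 51 and 41 physical qubits per logical qubit for the three codes, all below 100. This counts only the memory block itself, not the ancillas needed for syndrome extraction.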

We can also derive an upper bound on the expected logical error rate of an $[[n, k, d]]$ code.
Since all errors of weight up to $t = \lfloor (d-1)/2 \rfloor$ can be corrected, we
pessimistically assume that any error of higher weight leads to a logical error. The approximate
logical error probability is

$$P_n(p) = \sum_{i=t+1}^{n} \binom{n}{i} p^i (1-p)^{n-i}.$$

At effective error rate $p_{\rm eff} = 0.007$, we get
$P_{89}(P_{23}(p_{\rm eff})) = 1 \times 10^{-16}$, $P_{127}(P_{23}(p_{\rm eff})) = 2.5 \times 10^{-19}$,
and $P_{255}(P_{23}(p_{\rm eff})) = 7 \times 10^{-24}$, which agree with the simulation results.
We see that the [[5865,143,105]] code stands out because of its high code rate and extremely low
logical error rate, making it a very promising code in practice.
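The nested bound above can be evaluated directly. The sketch below sums the binomial tail term by term (rather than subtracting a cumulative probability from one, which would lose all precision at these magnitudes); the function name `block_failure_bound` is hypothetical, and the inner Golay block is treated as failing independently on each outer-code position.

```python
from math import comb

def block_failure_bound(n, d, p):
    """Probability that more than t = floor((d - 1) / 2) of n independent
    positions fail, each with probability p."""
    t = (d - 1) // 2
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(t + 1, n + 1))

p_eff = 0.007
inner = block_failure_bound(23, 7, p_eff)   # [[23,1,7]] Golay block
print(f"Golay block: {inner:.2e}")
for n, d in [(89, 9), (127, 11), (255, 15)]:
    print(f"n = {n}: {block_failure_bound(n, d, inner):.1e}")
```

Under these assumptions the three outer codes come out at about $10^{-16}$, $2.5 \times 10^{-19}$ and $7 \times 10^{-24}$, respectively.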

For the processor block, we need a CSS code that allows a transversal non-Clifford gate, such as
the concatenated [[15,1,3]] shortened Reed-Muller code or the 3D gauge color code [29]. Here we
simulate the concatenated [[15,1,3]] code, which allows a transversal $T$ gate. This code has
the ability to correct almost all bit-flip errors when $p_{\rm eff}$ is small. There exists an
optimal efficient soft-decision

FIG. 4. (Color online) The logical error rate of the memory blocks for the [[2047,23,63]] code
(blue), [[2921,57,77]] code (red) and [[5865,143,105]] code (green) versus physical error rate
$p$. The number of samples for each point is up to 4 x 10®. The dashed lines are from
extrapolation of linear fitting.




FIG. 5. (Color online) The logical error rate of the logical T gate 
performed on the concatenated [[15,1,3]] code of two levels (blue) 
and three levels (red) using up to 3 x 10^ samples for each point. 
The green point represents a numerical upper bound for three levels 
when the physical error rate is p = 7 x 10^“^. The dashed lines are 
from extrapolation of linear fitting. 


decoder for concatenated codes, which diagnoses the error syndromes using a message-passing
algorithm. We performed Monte-Carlo simulations of the concatenated [[15,1,3]] code of two and
three levels with the soft-decision decoder. The results are plotted in Fig. 5. Note that for
three levels of concatenation, the logical error rate drops to less than
2 X 10~^^ at Peff = 0.007 (by extrapolation). Hence, when the 
physical error rate is less than $5 \times 10^{-4}$, the logical error rate
is below 2 x 10“^^ for a single round of syndrome extraction 
and all logical gates. In this case, the error rate for each logical operation is well below
$10^{-10}$, which will allow interesting quantum algorithms that are impossible to run on
classical computers. Nor do we have any reason to believe this is optimal: in all likelihood,
better codes exist for both the storage and processor blocks.

Conclusion - We have proposed a scheme for FTQC using large block codes as memory blocks and the
concatenated [[15,1,3]] Reed-Muller code as the processor block. We showed that its logical
error rate can be made low enough to implement practically interesting quantum algorithms with
reasonable physical error rates. The number of physical qubits required to protect a single
logical qubit can be reduced from thousands of qubits to hundreds of qubits or fewer, and no
magic state distillation is needed. It is very likely that good codes, such as quantum LDPC
codes [32-34], may allow memory blocks with even better performance. The memory error rate could
be further reduced if we exploited the correlations between effective errors before and after
the syndrome extraction circuits.

On the other hand, the frequency of use of each clean ancilla state varies dramatically. The
ancillas $|0\rangle_L^{\otimes k}$ and $|+\rangle_L^{\otimes k}$ for syndrome extraction are
used much more often than those for specific measurements, teleportation and logical gates.
Thus, the distillation protocols should vary for different ancilla states to maximize the
throughput of ancilla generation. The total resources needed depend strongly on the details of
the ancilla distillation protocols, which will be carefully investigated in our future work.

We thank Daniel Gottesman and Jim Harrington for useful 
discussions, and the Oak Ridge National Lab for providing 
the Titan supercomputing resource. This work was supported 
in part by the IARPA QCS program; by HRL subcontract No.
1144-400707-DS; and by NSF Grant No. CCF-1421078.

Supplementary Materials


PHASE GATE AND SWAP GATE 


In this section, we show the details of implementing logical phase and SWAP gates in the same block using logical state
measurements. As shown in Fig. S1(a), assuming qubit 1 is in state $|\psi\rangle$, the phase gate can be realized by 1) preparing a buffer
state $|0\rangle$ as qubit 0; 2) measuring $X_0Y_1$, followed by measuring $Z_1$. This leaves the state as $S|\psi\rangle|0\rangle$ up to a Pauli operator
correction. For the SWAP gate, we prepare a buffer qubit (qubit 0) initially in $|0\rangle$, as shown in Fig. S1(b). To SWAP $|\psi_1\rangle|\psi_2\rangle$, we
measure $X_0X_1X_2$, followed by $Z_0Z_1Z_2$, followed by $X_0$. This leaves the state as $|+\rangle|\psi_2\rangle|\psi_1\rangle$ up to a Pauli operator
correction. The buffer qubit can be reset to $|0\rangle$ by measuring $Z_0$.



FIG. S1. (a) Phase gate, (b) SWAP gate. The red blocks represent joint measurements of logical qubits.


EFFECTIVE ERROR MODEL 


There are three types of errors introduced by imperfect circuits in syndrome extraction and logical state measurement in our
scheme. These are: measurement errors, gate errors, and preparation errors. Theorem 1 states that it is possible to replace these
errors with equivalent errors before and after the circuit if they are all independent in the same block. We can then use these
equivalent error processes as our error model and treat the circuit as being ideal. We treat these errors one at a time.

Every error in a $Z$ measurement is equivalent to a single $X$ error followed by a perfect measurement, while every $X$
measurement error can be modeled by a single $Z$ error followed by a perfect measurement. In Fig. S2 we can see that an error in the $Z$
measurement is equivalent to two $X$ errors, before and after the circuit, on the codeword qubit. An error in an $X$ measurement
has a similar effect.


FIG. S2. The effect of measurement errors on the ancilla qubit can be effectively replaced by errors before and after the circuit. 

We treat a noisy gate as a perfect gate followed by an error. We treat the first CNOT explicitly here; the situation for the
second is just the same. An error in the CNOT gate is represented as a tensor product of two operators from $\{I, X, Y, Z\}$. Each
such operator can be equivalently replaced by errors before and after the circuit. See panels (a) to (f) in Fig. S3.



FIG. S3. The effect of gate errors (panels (a) to (f)) on the first CNOT, and ancilla preparation errors (panels (g) to (l)). Both types of
errors can be replaced by errors before and after the circuit.


Preparation errors include errors in initializing the ancilla blocks, and any memory or transport errors that occur in storing or
distributing them. The replacement of preparation errors on each qubit of the two ancilla blocks is shown in panels (g) to (l) in
Fig. S3.


From the argument above, we see that we can replace the noisy circuit with a perfect circuit preceded or followed by a
noisy process on the codeword qubits at a single time step. Note that the errors before and after a circuit are correlated. It
may be possible to use this correlation to improve the estimation of the error, based on the entire time record of syndrome
measurements. For now, we ignore this in finding the error process for a single time step. If there are also memory errors
$\mathcal{E}_m$ on qubits in the codeword block, then the error process before the circuit is $\mathcal{E}_{\rm tot} = \mathcal{E}_i \circ \mathcal{E}_m \circ \mathcal{E}_f$, as in Fig. S4. Note that
$\mathcal{E}_f$ here comes from the previous circuit, and $\mathcal{E}_m$ is the memory error. For simplicity, suppose that all errors are Pauli errors,
and define $\mathcal{I}[\rho] = \rho$, $\mathcal{X}[\rho] = X\rho X$, $\mathcal{Z}[\rho] = Z\rho Z$, $\mathcal{Y}[\rho] = Y\rho Y$. Since we model memory errors as depolarizing errors,

FIG. S4. Effective error model in a single time step of circuit 


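$\mathcal{E}_m = (1 - \epsilon)\mathcal{I} + \frac{\epsilon}{3}(\mathcal{X} + \mathcal{Y} + \mathcal{Z})$. $\mathcal{E}_f$ can be derived as follows:

$$
\begin{aligned}
\mathcal{E}_f ={}& \underbrace{[(1 - p_m)\mathcal{I} + p_m\mathcal{X}] \circ [(1 - p_m)\mathcal{I} + p_m\mathcal{Z}]}_{\text{measurement}} \\
&\circ \underbrace{\left[\left(1 - \tfrac{8}{15}p_{g_2}\right)\mathcal{I} + \tfrac{8}{15}p_{g_2}\mathcal{X}\right]}_{\text{first CNOT}}
 \circ \underbrace{\left[\left(1 - \tfrac{4}{5}p_{g_2}\right)\mathcal{I} + \tfrac{4}{15}p_{g_2}(\mathcal{X} + \mathcal{Y} + \mathcal{Z})\right]}_{\text{second CNOT}} \\
&\circ \underbrace{\left[\left(1 - \tfrac{2}{3}r\right)\mathcal{I} + \tfrac{2}{3}r\mathcal{X}\right] \circ \left[(1 - r)\mathcal{I} + \tfrac{1}{3}r(\mathcal{X} + \mathcal{Y} + \mathcal{Z})\right]}_{\text{ancilla preparation}}. \qquad \text{(S1)}
\end{aligned}
$$

Similarly, we could have $\mathcal{E}_i$ as:

$$
\begin{aligned}
\mathcal{E}_i ={}& \underbrace{[(1 - p_m)\mathcal{I} + p_m\mathcal{X}] \circ [(1 - p_m)\mathcal{I} + p_m\mathcal{Z}]}_{\text{measurement}} \\
&\circ \underbrace{\left[\left(1 - \tfrac{4}{5}p_{g_2}\right)\mathcal{I} + \tfrac{4}{15}p_{g_2}(\mathcal{X} + \mathcal{Y} + \mathcal{Z})\right]}_{\text{first CNOT}}
 \circ \underbrace{\left[\left(1 - \tfrac{8}{15}p_{g_2}\right)\mathcal{I} + \tfrac{8}{15}p_{g_2}\mathcal{Z}\right]}_{\text{second CNOT}} \\
&\circ \underbrace{\left[\left(1 - \tfrac{2}{3}r\right)\mathcal{I} + \tfrac{2}{3}r\mathcal{Z}\right] \circ \left[(1 - r)\mathcal{I} + \tfrac{1}{3}r(\mathcal{X} + \mathcal{Y} + \mathcal{Z})\right]}_{\text{ancilla preparation}}. \qquad \text{(S2)}
\end{aligned}
$$

Thus, the entire noise process can be represented as:

$$
\begin{aligned}
\mathcal{E}_{\rm tot} ={}& \mathcal{E}_i \circ \mathcal{E}_m \circ \mathcal{E}_f \\
={}& \left(\tfrac{1}{3}\epsilon + \tfrac{4}{3}r + 2p_m + \tfrac{16}{15}p_{g_2}\right)\mathcal{X}
 + \left(\tfrac{1}{3}\epsilon + \tfrac{4}{3}r + 2p_m + \tfrac{16}{15}p_{g_2}\right)\mathcal{Z}
 + \left(\tfrac{1}{3}\epsilon + \tfrac{2}{3}r + \tfrac{8}{15}p_{g_2}\right)\mathcal{Y} \\
&+ \left[1 - \left(\epsilon + \tfrac{10}{3}r + 4p_m + \tfrac{8}{3}p_{g_2}\right)\right]\mathcal{I}
 + O\!\left(\max\{\epsilon^2, r^2, p_m^2, p_{g_2}^2\}\right). \qquad \text{(S3)}
\end{aligned}
$$

If we set $\epsilon = p_m = r = p_{g_2} = p$, $\mathcal{E}_{\rm tot}$ can be approximated by $\mathcal{E}_{\rm tot} \approx (1 - 11p)\mathcal{I} + \frac{71}{15}p\mathcal{X} + \frac{71}{15}p\mathcal{Z} + \frac{23}{15}p\mathcal{Y}$.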
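The first-order coefficients can be cross-checked by composing single-qubit Pauli channels exactly. The sketch below is a verification aid under explicit assumptions: the individual factors (measurement, first and second CNOT, ancilla preparation, memory) are our reading of the decomposition into $\mathcal{E}_i$, $\mathcal{E}_m$ and $\mathcal{E}_f$, correlations between errors before and after the circuit are ignored as in the text, and all rates are set equal to $p$. Function names are hypothetical.

```python
from fractions import Fraction
from functools import reduce

# Paulis as (x, z) bit pairs; overall phases are ignored.
I, X, Z, Y = (0, 0), (1, 0), (0, 1), (1, 1)

def pauli_channel(px=0, py=0, pz=0):
    """Probability distribution over Paulis for a single-qubit Pauli channel."""
    return {I: 1 - px - py - pz, X: px, Y: py, Z: pz}

def compose(c1, c2):
    """Compose two Pauli channels: Paulis multiply (XOR of bits), probabilities multiply."""
    out = {I: 0, X: 0, Y: 0, Z: 0}
    for a, pa in c1.items():
        for b, pb in c2.items():
            out[(a[0] ^ b[0], a[1] ^ b[1])] += pa * pb
    return out

def depolarizing(q):
    return pauli_channel(q / 3, q / 3, q / 3)

def total_channel(p):
    """E_i o E_m o E_f with every physical rate equal to p (assumed factors)."""
    measurement = [pauli_channel(px=p), pauli_channel(pz=p)]
    e_f = measurement + [
        pauli_channel(px=Fraction(8, 15) * p),  # first CNOT
        depolarizing(Fraction(4, 5) * p),       # second CNOT
        pauli_channel(px=Fraction(2, 3) * p),   # ancilla preparation
        depolarizing(p),
    ]
    e_i = measurement + [
        depolarizing(Fraction(4, 5) * p),       # first CNOT
        pauli_channel(pz=Fraction(8, 15) * p),  # second CNOT
        pauli_channel(pz=Fraction(2, 3) * p),   # ancilla preparation
        depolarizing(p),
    ]
    e_m = [depolarizing(p)]
    return reduce(compose, e_i + e_m + e_f)

p = Fraction(1, 10**9)
tot = total_channel(p)
for name, key in [("X", X), ("Z", Z), ("Y", Y)]:
    print(name, float(tot[key] / p))
```

For small $p$ the printed ratios approach $71/15 \approx 4.733$ for $X$ and $Z$ and $23/15 \approx 1.533$ for $Y$, with the difference coming only from second-order terms.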

For non-Pauli errors, we can apply a similar trick by expanding the noise process in the Pauli basis and translating the errors 
term by term. In that case, each term also has a phase, which may depend on the measurement outcomes. This makes analysis 
more complicated but not really different in principle, especially since the syndrome measurements will tend to project into the 
Pauli basis. 


STABILIZER FORMALISM 

The $n$-qubit Pauli group $G_n$ is the set of all operators of the form

$$i^{c}\, O_1 \otimes O_2 \otimes \cdots \otimes O_n, \qquad \text{(S4)}$$

where $O_j \in \{I, X, Y, Z\}$ for all $j$, and $i^{c}$ is an overall phase.

We ignore the overall phase in this section, since it represents only a global phase. In general, for each $[[n, k, d]]$ stabilizer code
with stabilizer group $\mathcal{S}$, we can always do the encoding operation starting from the canonical code state $|0\rangle^{\otimes (n-k)}|\psi\rangle$, where $|\psi\rangle$ is the $k$-qubit
quantum state to be protected and used in the computation. This is a stabilizer code, whose logical operators are
$X_i$ and $Z_i$ for $i = n-k+1, \ldots, n$ and whose stabilizer generators are $Z_i$ for $i = 1, \ldots, n-k$. We can take the symplectic
partners to be $X_i$ for $i = 1, \ldots, n-k$. This is the "code" that we start with before the encoding operation. The encoding of $k$
qubits into $n$ qubits can be specified by a unitary operation $U_E \in C_n$, where $C_n$ is the normalizer of the Pauli group $G_n$ in $U(2^n)$
(the Clifford group). Note that given a code, the encoding circuit is not unique. Here we choose a particular $U_E$. Under the encoding
operation, the first $n-k$ $Z$ operators are mapped to $n-k$ stabilizer generators with the relation $S_i = U_E Z_i U_E^\dagger$, whereas the images
of the $X$ operators acting on those qubits, $T_i = U_E X_i U_E^\dagger$, are called pure errors or symplectic partners of $S_i$. The images of the
Pauli operators acting on the last $k$ qubits are the logical Pauli operators $\bar{X}_i = U_E X_{n-k+i} U_E^\dagger$, $\bar{Z}_i = U_E Z_{n-k+i} U_E^\dagger$.

Given a Pauli error $E \in G_n$, we can always find the corresponding syndromes (assuming no measurement errors), so we
define the syndrome extraction function $S: G_n \to \{-1, 1\}^{n-k}$, where $S(E)$ gives the syndromes of error $E$. We can also
define the following function, $T: G_n \to G_n$, to represent the pure error string. $T(E)$ is uniquely determined by the syndrome of $E$,
$\mathbf{s} \equiv S(E) = \{s_1, s_2, \ldots, s_{n-k}\}$, and can be explicitly represented as:

$$T(E) = U_E \left( \prod_{i=1}^{n-k} X_i^{(1 - s_i)/2} \right) U_E^\dagger, \qquad \text{(S5)}$$

which is a product of pure errors. Define another function:

$$\mathcal{L}: G_n \to G_n, \quad E \mapsto E\,T(E), \qquad \text{(S6)}$$

where $\mathcal{L}(E)$ is a product of logical Pauli operators in the normalizer $N(\mathcal{S})$ of the stabilizer group. It is easy to observe that
$E$ can be uniquely decomposed as:

$$E = \mathcal{L}(E)\,T(E). \qquad \text{(S7)}$$
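The syndrome extraction function defined above has a direct computational reading: each syndrome bit records whether the error commutes or anticommutes with the corresponding stabilizer generator. The snippet below is a minimal sketch using Pauli strings; the three-qubit generators `ZZI` and `IZZ` are a toy example chosen for brevity, not a code from this paper, and the function names are hypothetical.

```python
def anticommute(a, b):
    """Two single-qubit Paulis anticommute iff both are non-identity and different."""
    return a != "I" and b != "I" and a != b

def syndrome(error, generators):
    """+1 if the error commutes with a generator, -1 if it anticommutes (phases ignored)."""
    result = []
    for g in generators:
        n_anti = sum(anticommute(a, b) for a, b in zip(error, g))
        result.append(-1 if n_anti % 2 else 1)
    return result

GENERATORS = ["ZZI", "IZZ"]   # toy bit-flip code, assumed for illustration
for err in ["XII", "IXI", "IIX", "ZII"]:
    print(err, syndrome(err, GENERATORS))
```

In this toy case the three single-qubit $X$ errors give three distinct syndromes, while a $Z$ error commutes with both generators and is invisible, which mirrors the split of an error into a pure-error part fixed by the syndrome and a part that the syndrome cannot see.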

Last, we define a function which truncates the last fc single Pauli operators of a length n Pauli operators string; 

: Gn -* Gk, s !->• Sn-k^V"Sn (S 8 ) 

The images of the last $k$ $X$ and $Z$ Pauli operators of the canonical code correspond to the logical operators of the code. If we run the encoding circuit backward, $\mathcal{K}(U_E^\dagger \mathcal{L}(E) U_E)$, with $\mathcal{K}$ the truncation function of Eq. (S8), indicates which logical errors actually happened according to $E$. Note that if, for $E$ and $E'$, the corresponding $\mathcal{L}$ and $\mathcal{L}'$ differ only by multiplication by some element of the stabilizer group, they should be regarded as equivalent. So when decoding, the only thing that matters is to estimate the equivalence class $[\mathcal{L}]$ containing all equivalent $\mathcal{L}$. Each equivalence class corresponds to a unique $L \in G_k$, which is the logical error that the class stands for, and can be represented as $L = \mathcal{K}(U_E^\dagger \mathcal{L} U_E)$ for any $\mathcal{L}$ in the class. So the optimal decoding problem can be transformed into a maximum a posteriori (MAP) probability problem:

$$\hat{L} = \arg\max_{L} P(L \mid s). \tag{S9}$$
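For a very small code the MAP problem can be solved by exhaustive enumeration, which makes the role of the equivalence classes concrete. The sketch below is only an illustration under stated assumptions: it uses the $[[5,1,3]]$ code (not one of the codes of this scheme), an independent depolarizing channel with a hypothetical rate `p = 0.05`, and it labels the logical class of an error by its commutation with the logical $\bar{Z}$ and $\bar{X}$ instead of running the encoding circuit backward. Syndromes are written as bits, with bit 1 standing for the eigenvalue $-1$.

```python
import itertools
import numpy as np

n = 5
p = 0.05  # assumed depolarizing probability per qubit (illustrative value only)

def from_string(label):
    """Binary symplectic vector (x | z) of a Pauli string such as 'XZZXI'."""
    x = [int(c in "XY") for c in label]
    z = [int(c in "ZY") for c in label]
    return np.array(x + z, dtype=int)

def sp(a, b):
    """Symplectic inner product: 1 if the two Paulis anticommute, else 0."""
    return int((a[:n] @ b[n:] + a[n:] @ b[:n]) % 2)

stabilizers = [from_string(s) for s in ("XZZXI", "IXZZX", "XIXZZ", "ZXIXZ")]
logical_x = from_string("XXXXX")
logical_z = from_string("ZZZZZ")

def syndrome_bits(e):
    return tuple(sp(e, g) for g in stabilizers)

def prior(e):
    """Probability of a Pauli error under independent depolarizing noise."""
    weight = int(np.count_nonzero(e[:n] | e[n:]))
    return (p / 3) ** weight * (1 - p) ** (n - weight)

def map_decode(s):
    """Sum the prior over every error with syndrome s, grouped by logical class."""
    posterior = {}
    for bits in itertools.product((0, 1), repeat=2 * n):
        e = np.array(bits, dtype=int)
        if syndrome_bits(e) != s:
            continue
        # (logical X component, logical Z component) of the class of e
        label = (sp(e, logical_z), sp(e, logical_x))
        posterior[label] = posterior.get(label, 0.0) + prior(e)
    total = sum(posterior.values())
    posterior = {k: v / total for k, v in posterior.items()}
    return max(posterior, key=posterior.get), posterior

# Syndrome produced by a single X error on the second qubit
best, post = map_decode(syndrome_bits(from_string("IXIII")))
```

The enumeration visits all $4^n$ Pauli strings, so it is only feasible for toy sizes; the point is that the decoder compares total class probabilities, not individual errors.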


For small quantum codes concatenated together (like the concatenated [[15,1,3]] code that is used in our scheme), an optimal and efficient decoder called the soft-decision decoder does exist [S1]. Consider an $l$-level concatenated code, and let the $j$th level be an $[[n_j, k_j, d_j]]$ code with $k_j = 1$ for $j \neq 1$. The parameters of the corresponding concatenated code are $[[\prod_{j=1}^{l} n_j,\; k_1,\; \prod_{j=1}^{l} d_j]]$. Let $s_j^{(i)} \in \{-1,1\}^{n_j - k_j}$ be the syndromes of the $i$th block of the $j$th concatenation layer after syndrome extraction. Denote by $\mathbf{s}_j^{(i)}$ the collection of syndromes whose stabilizer generators act nontrivially on the physical qubits associated with the $i$th block of the $j$th concatenation layer. In other words, these sets of syndromes can be defined as:

$$\mathbf{s}_j^{(i)} = \{ s_j^{(i)} \} \cup \left( \bigcup_{p = n_j (i-1) + 1}^{n_j i} \mathbf{s}_{j+1}^{(p)} \right).$$

Note that $\mathbf{s}_l^{(i)} = \{ s_l^{(i)} \}$ are the syndromes of the codes at the bottom level. Finally, denote by $\mathbf{s}_j = \bigcup_{i=1}^{n_1 \cdots n_{j-1}} \mathbf{s}_j^{(i)}$ the collection of all the syndromes from layers $j$ to $l$. Then it is easy to see that $\mathbf{s}_1$ is the set of all syndromes of the concatenated code's stabilizer generators. Estimating $L$ for the concatenated code is equivalent to estimating $L_1$ for the code at the top level, so the decoding is equivalent to finding $\arg\max_{L_1} P(L_1 \mid \mathbf{s}_1)$. Define $L_2 = (L_2^{(1)}, \ldots, L_2^{(n_1)})$, an array of logical operators at the second level. Then this probability can be factorized by conditioning on $L_2$:

$$P(L_1 \mid \mathbf{s}_1) = \sum_{L_2} P(L_1 \mid \mathbf{s}_1, L_2)\, P(L_2 \mid \mathbf{s}_1) \propto \sum_{L_2} \delta\!\left[ L_1 = \mathcal{K}\!\left( U_{E_1}^\dagger \mathcal{L}_1(L_2)\, U_{E_1} \right) \right] \delta\!\left[ s_1^{(1)} = \mathcal{S}_1(L_2) \right] \prod_{i=1}^{n_1} P\!\left( L_2^{(i)} \mid \mathbf{s}_2^{(i)} \right), \tag{S10}$$

where $\mathcal{L}_j$, $U_{E_j}$ and $\mathcal{S}_j$ are $\mathcal{L}$, $U_E$ and $\mathcal{S}$ corresponding to the code at the $j$th level, and $\delta(\cdot)$ is the indicator function.

The derivation repeatedly uses Bayes' rule and the following facts: a) the syndromes and logical errors of level $j$ are determined by the logical errors of level $j+1$; b) the channel is not correlated (more specifically, the error model can only correlate qubits in the same block, not across different blocks). The optimal decoding is thereby reduced to a sum-product problem on a tree, which can be solved exactly and efficiently using a message-passing algorithm. If the estimated logical error is $\hat{L}_1 = \mathcal{K}(U_E^\dagger \mathcal{L}(E) U_E)$, the decoding is a success; otherwise it is a failure. The error at the physical level can then be estimated as:

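$$\hat{E} = \hat{\mathcal{L}}\, T_s, \tag{S11}$$

where $T_s$ is the pure error fixed by the measured syndrome $s$ and $\hat{\mathcal{L}}$ is a representative of the estimated logical class. This estimate is used for actual correction at the physical level.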
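The tree structure of the sum-product computation can be seen in a classical toy analogue. The sketch below is an assumption-laden illustration, not the decoder used for the simulations: it takes a two-level concatenated three-bit repetition code with independent bit flips of hypothetical probability `p = 0.1`, fixes the logical bit of a block to be its middle bit (pure errors on the outer two bits), passes each bottom block's posterior upward as the effective flip probability of the corresponding top-level bit, and checks the result against brute-force enumeration over all nine bits.

```python
import itertools

p = 0.1  # assumed independent bit-flip probability per physical bit

def block_syndrome(e):
    """Two parity checks of a 3-bit repetition block."""
    return (e[0] ^ e[1], e[1] ^ e[2])

def logical_bit(e):
    """With pure errors on bits 1 and 3, the logical component is the middle bit."""
    return e[1]

def block_posterior(s, flip_probs):
    """[P(logical=0 | s), P(logical=1 | s)] for one block with independent flips."""
    weights = [0.0, 0.0]
    for e in itertools.product((0, 1), repeat=3):
        if block_syndrome(e) != s:
            continue
        pr = 1.0
        for bit, q in zip(e, flip_probs):
            pr *= q if bit else 1.0 - q
        weights[logical_bit(e)] += pr
    total = sum(weights)
    return [w / total for w in weights]

def soft_decode(bottom_syndromes, top_syndrome):
    """Message passing: bottom posteriors become the top level's flip probabilities."""
    messages = [block_posterior(s, [p] * 3)[1] for s in bottom_syndromes]
    return block_posterior(top_syndrome, messages)

def brute_force(bottom_syndromes, top_syndrome):
    """Exact posterior of the top logical bit by enumerating all 2^9 errors."""
    weights = [0.0, 0.0]
    for bits in itertools.product((0, 1), repeat=9):
        blocks = [bits[0:3], bits[3:6], bits[6:9]]
        if [block_syndrome(b) for b in blocks] != list(bottom_syndromes):
            continue
        logicals = tuple(logical_bit(b) for b in blocks)
        if block_syndrome(logicals) != top_syndrome:
            continue
        k = sum(bits)
        weights[logical_bit(logicals)] += p ** k * (1 - p) ** (9 - k)
    total = sum(weights)
    return [w / total for w in weights]

bottom = [(1, 1), (0, 0), (1, 0)]
top = (1, 0)
soft = soft_decode(bottom, top)
exact = brute_force(bottom, top)
```

The two posteriors agree because errors in different blocks are independent, which is the same assumption (b) that the derivation above relies on.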

In general, the optimal decoding problem is NP-hard [S2]. In practice, one must use codes with some structure (like the ones chosen for the memory block) so that an efficient hard-decision decoding algorithm provides an error estimate $\hat{E}$, based on the assumption that errors are all independent and that the same type of error is equally likely to happen to different qubits. Efficient hard-decision decoders exist for the quantum BCH codes and the quantum Golay code. The Golay code can be decoded up to its correctability $t = \lfloor (d-1)/2 \rfloor = 3$ using the Kasami error-trapping decoder. The BCH codes can be decoded using the Berlekamp-Massey algorithm.

We can decompose $\hat{E}$ into $\hat{E} = \mathcal{L}(\hat{E})\, T_s$, where $T_s = \mathcal{T}(\hat{E}) = \mathcal{T}(E)$. If $\hat{L} = \mathcal{K}(U_E^\dagger \mathcal{L}(\hat{E}) U_E)$ is equal to $L = \mathcal{K}(U_E^\dagger \mathcal{L}(E) U_E)$, we declare that the error correction is correct; otherwise, we declare a failure.

To determine whether a correction of a random Pauli error is correct, we need to run the encoding circuit backward. So first of all, we need to find the encoding circuit of a stabilizer code. Using the symplectic form, this can be done in the following way. Let the symplectic matrix representation of the $n-k$ stabilizer generators and the logical operators be $M$. A Clifford circuit is an automorphism of the Pauli group $G_n$, which can be represented as a linear map that acts on the matrix:

$$M \to M' = MC, \tag{S12}$$


where $C$ is a nonsingular $2n \times 2n$ binary matrix representing the action of the Clifford circuit. Since $C$ represents a unitary transformation, it must preserve the commutation relations of the operators it transforms; this restriction corresponds to the constraint

$$C J C^T = J \quad \text{with} \quad J = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}. \tag{S13}$$
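The symplectic constraint is easy to check numerically for elementary gates. The following sketch assumes the row-vector convention $v \to vC$ with Paulis written as $(x \mid z)$ and qubits indexed from 0; the helper names `hadamard`, `cnot` and `is_symplectic` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def symplectic_form(n):
    """J = [[0, I], [I, 0]] for n qubits."""
    eye = np.eye(n, dtype=int)
    zero = np.zeros((n, n), dtype=int)
    return np.block([[zero, eye], [eye, zero]])

def hadamard(n, q):
    """Binary matrix of H on qubit q: exchanges the x and z bits of that qubit."""
    c = np.eye(2 * n, dtype=int)
    c[[q, n + q]] = c[[n + q, q]]
    return c

def cnot(n, control, target):
    """Binary matrix of CNOT: X_c -> X_c X_t and Z_t -> Z_c Z_t."""
    c = np.eye(2 * n, dtype=int)
    c[control, target] = 1
    c[n + target, n + control] = 1
    return c

def is_symplectic(c):
    """Check C J C^T = J over GF(2)."""
    j = symplectic_form(c.shape[0] // 2)
    return np.array_equal(c @ j @ c.T % 2, j)

n = 2
circuit = hadamard(n, 0) @ cnot(n, 0, 1) % 2    # H on qubit 0, then CNOT(0 -> 1)

z0 = np.array([0, 0, 1, 0])                     # Z on qubit 0, written as (x | z)
z1 = np.array([0, 0, 0, 1])                     # Z on qubit 1
image_z0 = z0 @ circuit % 2                     # becomes X X
image_z1 = z1 @ circuit % 2                     # becomes Z Z

not_clifford = np.eye(2 * n, dtype=int)
not_clifford[0, 1] = 1                          # changes X_0 without the matching Z update
```

Here the stabilizers $Z_0$, $Z_1$ of $|00\rangle$ are mapped to $XX$ and $ZZ$, the stabilizers of a Bell pair, while the last matrix is invertible but fails the constraint and so does not describe any Clifford circuit.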


Note that the $n+k$ rows of the matrix $M$ do not form a full basis. This reflects the fact that the $n+k$ Pauli operators they represent do not form a full set of generators. We can supplement them by adding the $n-k$ symplectic partners of the stabilizer generators to $M$ to form a new full-rank matrix $\tilde{M}$.

The symplectic partner $T_i$ of $S_i$ should satisfy

$$[T_i, S_j] = 0 \ \text{ for } i \neq j, \qquad \{T_i, S_i\} = 0. \tag{S14}$$


Start from the canonical code we described before, and define the matrix $M_0$ to represent this "code". We use the following order for the operators: $X_n, \ldots, X_1, Z_n, \ldots, Z_1$. This gives us the very simple matrix

$$M_0 = \begin{pmatrix} I_{n-k} & 0 & 0 & 0 \\ 0 & I_k & 0 & 0 \\ 0 & 0 & I_{n-k} & 0 \\ 0 & 0 & 0 & I_k \end{pmatrix}, \tag{S15}$$


where the first n-k rows are the symplectic partners, the next k rows are the logical X operators, the next n-k rows are the 
stabilizer generators and the last k rows are the logical Z operators. 

From this code, we produce the $[[n, k, d]]$ code we are interested in by applying an encoding circuit $U_E$, which has representation $C_E$:

$$\tilde{M} = M_0 C_E = C_E. \tag{S16}$$


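In other words, if we know all logical operators and symplectic partners of the stabilizer code, we can build the encoding matrix of that code directly. Suppose we know all of the stabilizer generators, the logical $Z$ operators, and $\bar{X}_l$ for $l = 1, \ldots, i$; we can recursively find $\bar{X}_{i+1}$ by solving the following linear equation:

$$\left[\, S_1 \cdots S_{n-k} \mid \bar{Z}_1 \cdots \bar{Z}_k \mid \bar{X}_1 \cdots \bar{X}_i \,\right]^T \odot \bar{X}_{i+1} = \big[\, \underbrace{0 \cdots 0}_{n-k} \mid \underbrace{0 \cdots 0}_{i}\ 1\ \underbrace{0 \cdots 0}_{k-i-1} \mid \underbrace{0 \cdots 0}_{i} \,\big]^T, \tag{S17}$$

where $\odot$ is the symplectic inner product between binary strings. Similarly, if we have found all $S_i$, $\bar{Z}_i$, $\bar{X}_i$, and $T_l$ for $l = 1, \ldots, i$, we can find $T_{i+1}$ by solving the following equation:

$$\left[\, T_1 \cdots T_i \mid S_1 \cdots S_{n-k} \mid \bar{Z}_1 \cdots \bar{Z}_k \mid \bar{X}_1 \cdots \bar{X}_k \,\right]^T \odot T_{i+1} = \big[\, \underbrace{0 \cdots 0}_{i} \mid \underbrace{0 \cdots 0}_{i}\ 1\ \underbrace{0 \cdots 0}_{n-k-i-1} \mid \underbrace{0 \cdots 0}_{k} \mid \underbrace{0 \cdots 0}_{k} \,\big]^T. \tag{S18}$$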
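The recursive construction amounts to solving small linear systems over GF(2). The sketch below is a hedged illustration rather than the procedure used for the codes in this work: it assumes the $[[4,2,2]]$ code with stabilizers $XXXX$, $ZZZZ$ and the hypothetical choice $\bar{Z}_1 = ZZII$, $\bar{Z}_2 = ZIZI$, uses a plain Gaussian-elimination solver (`solve_gf2`, a helper written for this example), and sets all free variables to zero, so the partners it returns are one valid choice among many.

```python
import numpy as np

def from_string(label):
    """Binary symplectic vector (x | z) of a Pauli string."""
    x = [int(c in "XY") for c in label]
    z = [int(c in "ZY") for c in label]
    return np.array(x + z, dtype=int)

def symplectic_form(n):
    eye, zero = np.eye(n, dtype=int), np.zeros((n, n), dtype=int)
    return np.block([[zero, eye], [eye, zero]])

def solve_gf2(a, b):
    """Return one solution x of a x = b over GF(2) (free variables set to 0)."""
    a, b = a.copy() % 2, b.copy() % 2
    rows, cols = a.shape
    pivots, r = [], 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if a[i, c]), None)
        if pivot is None:
            continue
        a[[r, pivot]] = a[[pivot, r]]
        b[[r, pivot]] = b[[pivot, r]]
        for i in range(rows):
            if i != r and a[i, c]:
                a[i] ^= a[r]
                b[i] ^= b[r]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    if b[r:].any():
        raise ValueError("inconsistent system")
    x = np.zeros(cols, dtype=int)
    for i, c in enumerate(pivots):
        x[c] = b[i]
    return x

def find_partner(known, rhs, n):
    """Pauli whose symplectic product with known[i] equals rhs[i]."""
    return solve_gf2(np.array(known) @ symplectic_form(n) % 2, np.array(rhs))

n, k = 4, 2
S = [from_string("XXXX"), from_string("ZZZZ")]
Zbar = [from_string("ZZII"), from_string("ZIZI")]

Xbar = []
for i in range(k):  # logical X_{i+1} anticommutes only with logical Z_{i+1}
    rhs = [0] * (n - k) + [int(j == i) for j in range(k)] + [0] * len(Xbar)
    Xbar.append(find_partner(S + Zbar + Xbar, rhs, n))

T = []
for i in range(n - k):  # T_{i+1} anticommutes only with S_{i+1}
    rhs = [0] * len(T) + [int(j == i) for j in range(n - k)] + [0] * (2 * k)
    T.append(find_partner(T + S + Zbar + Xbar, rhs, n))

# Rows ordered as partners, logical X, stabilizers, logical Z
M = np.array(T + Xbar + S + Zbar)
J = symplectic_form(n)
```

With the rows ordered in this way, the assembled matrix satisfies $M J M^T = J$, so it can serve as the binary representation of an encoding circuit for the toy code.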


QUANTUM BCH CODE AND QUANTUM GOLAY CODE


Since the size of a concatenated code is the product of the code sizes at each layer, choosing codes that are not too big is important if one wishes to concatenate them. The Golay code and the BCH codes [S3] are well-known classical cyclic codes, valued for their remarkable algebraic structure and for their ability to correct multiple errors, which is especially useful when the size of the code is small. The scheme we consider for protecting the memory blocks against errors uses a two-layer concatenation of a quantum BCH code at the top layer and the [[23,1,7]] quantum Golay code at the bottom layer. The quantum codes that we consider here are derived from their classical counterparts by the CSS construction. We pick the following three quantum BCH codes, constructed from dual-containing classical BCH codes, that have reasonable code lengths, good code rates and good code distances:

1) [[89,23,9]] quantum BCH code, which is derived from the [89,56,9] classical BCH code. The generator polynomial of the 

classical code is + X^°+ X^'^+ + + + + + + + + X^°+ X^+X^+X^+ X^+ 1 . 

2) [[127,57,11]] quantum BCH code, which is derived from the [127,92,11] classical BCH code. The generator polynomial 
of the classical code is X^^+X^‘^ + X^^+X^^+X^^ + X^^+X^^ + X^^+X^'^+X^^ + X^^+X^^+X^+X^+X^+X^+X^+X + 1 . 

3) [[255,143,15]] quantum BCH code, which is derived from the [255,199,15] classical BCH code. The generator polyno¬ 
mial of the classical code is + X^° + X‘^^ + X*^ + + X^^ + + X^o -t X^e + X^^ + X^^ + X 22 + 

X 20 + ^ j^i 6 + + xio + X® + X^ + X^ -I- X3 + X2 + X + 1. 

Note that the [[23,1,7]] quantum Golay code is derived from the [23,12,7] classical Golay code. The generator polynomial of the classical code is $X^{11} + X^{10} + X^{6} + X^{5} + X^{4} + X^{2} + 1$.
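The parameters of the classical Golay code can be confirmed directly from a generator polynomial. The sketch below assumes the standard choice $g(X) = X^{11} + X^{10} + X^{6} + X^{5} + X^{4} + X^{2} + 1$, stores polynomials over GF(2) as Python integers (bit $i$ is the coefficient of $X^i$), and enumerates all $2^{12}$ codewords; the helper names are hypothetical.

```python
def poly_mul(a, b):
    """Carry-less product of two GF(2) polynomials stored as integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def poly_mod(a, m):
    """Remainder of a divided by m over GF(2)."""
    dm = m.bit_length() - 1
    while a.bit_length() - 1 >= dm:
        a ^= m << (a.bit_length() - 1 - dm)
    return a

n, k = 23, 12
g = sum(1 << e for e in (11, 10, 6, 5, 4, 2, 0))  # assumed generator polynomial

# A cyclic code of length n needs g(X) to divide X^n + 1
remainder = poly_mod((1 << n) | 1, g)

# Every codeword is message(X) * g(X) with deg(message) < k
weights = [bin(poly_mul(msg, g)).count("1") for msg in range(1, 2 ** k)]
min_distance = min(weights)
num_min_weight = weights.count(min_distance)

# CSS construction from a dual-containing [n, k] code gives [[n, 2k - n]]
logical_qubits = 2 * k - n
```

The minimum weight found this way is the classical distance 7, and $2k - n = 1$ reproduces the single logical qubit of the [[23,1,7]] quantum code.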

In all four cases, logical Zi operators can be obtained by shifting the corresponding generator polynomials for classical codes. 


[[15,1,3]] REED-MULLER CODE


The concatenated [[15,1,3]] code is constructed from the truncated classical Reed-Muller code. Classical Reed-Muller codes are weakly self-dual codes with simple and good structural properties [S4]. A Reed-Muller code has two parameters $r, m$ with $0 \le r \le m$, and is denoted by $RM(r, m)$. This code has length $2^m$, and $r$ is called its order. Let $C = RM(1, 4)$ with parameters [16,5,8]. Consider the following $(m+1)$ $2^m$-tuples:

$$\begin{aligned}
v_0 &= [\,1111\ \cdots\ 1111\ \ 1111\ \cdots\ 1111\,],\\
v_1 &= [\,0101\ \cdots\ 0101\ \ 0101\ \cdots\ 0101\,],\\
v_2 &= [\,0011\ \cdots\ 0011\ \ 0011\ \cdots\ 0011\,],\\
&\ \ \vdots\\
v_m &= [\,0000\ \cdots\ 0000\ \ 1111\ \cdots\ 1111\,].
\end{aligned}$$

For $m = 4$, $C$ is generated by $v_0$, $v_1$, $v_2$, $v_3$ and $v_4$. The codewords of $C$ have weight divisible by 8. Let $C'$ be the code obtained from $C$ by deleting the first bit. Then its codewords have weight either 0 or 7 mod 8. Let $v_i'$ be the punctured generator obtained by deleting the first bit of $v_i$; then $C'$ is a [15,5,7] code generated by $v_0'$, $v_1'$, $v_2'$, $v_3'$ and $v_4'$. Let $C_0'$ be the even subcode of $C'$ generated by $v_1'$, $v_2'$, $v_3'$, $v_4'$. Note that $v_1$, $v_2$, $v_3$ and $v_4$ have leading bit 0, but $v_0$ has leading bit 1. $C_0'$ has parameters [15,4,8] and its codewords have weight divisible by 8. The [[15,1,3]] quantum code is encoded as follows:

$$|0\rangle_L = \frac{1}{4} \sum_{u \in C_0'} |u\rangle, \qquad |1\rangle_L = \frac{1}{4} \sum_{u \in C_0'} |u + v_0'\rangle. \tag{S19}$$
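The weight properties quoted for $C$, $C'$ and $C_0'$ can be checked by brute force, since the largest of these codes has only 32 codewords. The sketch below assumes the generators $v_0, \ldots, v_4$ of $RM(1,4)$ are the all-ones vector and the four coordinate functions on 16 points, as in the tuples listed above; the helper `span` is introduced only for this check.

```python
import itertools
import numpy as np

m = 4
length = 2 ** m

# v_0 is all ones; v_i (i >= 1) is bit i-1 of the position index: 0101..., 0011..., etc.
v = [np.ones(length, dtype=int)]
v += [np.array([(j >> (i - 1)) & 1 for j in range(length)]) for i in range(1, m + 1)]

def span(generators):
    """All GF(2) linear combinations of the given generators, as a set of tuples."""
    words = set()
    for coeffs in itertools.product((0, 1), repeat=len(generators)):
        word = np.zeros(len(generators[0]), dtype=int)
        for c, g in zip(coeffs, generators):
            if c:
                word ^= g
        words.add(tuple(int(bit) for bit in word))
    return words

def weights(words):
    return {sum(word) for word in words}

code = span(v)                              # C = RM(1,4), parameters [16,5,8]
punctured = span([g[1:] for g in v])        # C', first bit deleted
even_subcode = span([g[1:] for g in v[1:]]) # C_0', generated by v_1', ..., v_4'
coset = {tuple(a ^ b for a, b in zip(word, v[0][1:])) for word in even_subcode}
```

The coset $C_0' + v_0'$ computed in the last line is the support of $|1\rangle_L$, and its codewords all have weight 7 or 15.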
The corresponding parity check matrix for the [[15,1,3]] code is

$$H = \begin{pmatrix} H_X & 0 \\ 0 & H_Z \end{pmatrix}, \qquad
H_X = \begin{pmatrix}
000000011111111\\
000111100001111\\
011001100110011\\
101010101010101
\end{pmatrix}, \qquad
H_Z = \begin{pmatrix}
000000011111111\\
000111100001111\\
011001100110011\\
101010101010101\\
000000000001111\\
000000000110011\\
000000001010101\\
000001100000011\\
000010100000101\\
001000100010001
\end{pmatrix},$$

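which is asymmetric between $X$ and $Z$. Then, with $T = \mathrm{diag}(1, e^{i\pi/4})$ and $x \in \{0, 1\}$,

$$\begin{aligned}
(T^\dagger)^{\otimes 15} |x\rangle_L &= \frac{1}{4} \sum_{u \in C_0'} (T^\dagger)^{\otimes 15} |u + x v_0'\rangle = \frac{1}{4} \sum_{u \in C_0'} e^{-i \frac{\pi}{4} \mathrm{wt}(u + x v_0')} |u + x v_0'\rangle \\
&= e^{-i \frac{7\pi}{4} x}\, \frac{1}{4} \sum_{u \in C_0'} |u + x v_0'\rangle = e^{i \frac{\pi}{4} x} |x\rangle_L,
\end{aligned} \tag{S20}$$

which implements the logical $T$ gate. The third equality follows because for any $v \in C_0'$, $\mathrm{wt}(v) = 0 \bmod 8$, while for any $v \in C_0' + v_0'$, $\mathrm{wt}(v) = 7 \bmod 8$.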
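The weight argument can be verified numerically without building the full 15-qubit state, because a diagonal gate applied to every qubit only multiplies each basis state by a phase that depends on its Hamming weight. The sketch below assumes the convention $T = \mathrm{diag}(1, e^{i\pi/4})$ and reports the phase for both $T$ and $T^\dagger$ applied transversally; which of the two is called "the" transversal gate depends on that convention. Function names are hypothetical.

```python
import cmath
import itertools

def punctured(i):
    """Generator v_i of RM(1,4) with its first bit deleted (15 bits)."""
    if i == 0:
        return [1] * 15
    return [(j >> (i - 1)) & 1 for j in range(1, 16)]

v0p = punctured(0)
generators = [punctured(i) for i in range(1, 5)]

def even_subcode():
    """All 16 codewords of C_0' = span(v_1', ..., v_4')."""
    words = []
    for coeffs in itertools.product((0, 1), repeat=4):
        word = [0] * 15
        for c, g in zip(coeffs, generators):
            if c:
                word = [a ^ b for a, b in zip(word, g)]
        words.append(word)
    return words

def transversal_phases(x, sign):
    """Distinct phases acquired by the basis states of |x>_L.

    sign=+1 applies T to every qubit, sign=-1 applies T^dagger to every qubit.
    """
    phases = set()
    for u in even_subcode():
        word = [a ^ (x & b) for a, b in zip(u, v0p)]
        phase = cmath.exp(sign * 1j * cmath.pi * sum(word) / 4)
        phases.add((round(phase.real, 9), round(phase.imag, 9)))
    return phases

target = cmath.exp(1j * cmath.pi / 4)
phases_0 = transversal_phases(0, -1)
phases_1_dagger = transversal_phases(1, -1)
phases_1_plain = transversal_phases(1, +1)
```

Each set contains a single phase, so the logical states are eigenstates of the transversal gate: under this convention $T^\dagger$ on every qubit multiplies $|1\rangle_L$ by $e^{i\pi/4}$, while $T$ on every qubit multiplies it by $e^{-i\pi/4}$.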

This code has a particularly large ability to correct almost all bit-flip errors when peff is small. Monte-Carlo simulation of 
two and three level concatenation of [[15,1,3]] using soft decision decoder has been shown in Fig. The simulation of 3-level 
concatenation using the soft-decision decoder is quite time consuming. It was implemented on the Titan supercomputer at Oak 
Ridge National Lab. We use 10® samples for peff = 0.04, 10^ samples for = 0.03, 3 x 10^ samples for peff = 0.02 and 3 x 10^ 
sample forpeff = 0.015. 

Note that for $p_{\rm eff} < 0.01$, the logical error rate is so small that the number of samples needed is too large for simulation; even on Titan, direct Monte Carlo simulation is much too expensive. We need to be careful here: since we are using a message-passing algorithm, the error-floor effect known from classical message-passing decoding might occur in the low-error-rate region. To evaluate the performance in the region $p_{\rm eff} < 0.01$, we can use the following observation. The distance for $X$ errors in the 3-level concatenated [[15,1,3]] code ([[3375,1,27]]) is 343, which means the code can correct all $X$ errors of weight less than 170, since the decoder is optimal. So we just need to focus on $Z$ errors. Define $P_L(w, p_{\rm eff})$ as the logical error probability after soft-decision decoding with parameter $p_{\rm eff}$ when Pauli errors of weight $w$ occur uniformly at random on the qubits. Define $P_L^Z(w, p_{\rm eff})$ as the logical error probability when $Z$ errors of weight $w$ occur uniformly at random on the qubits. Consider an all-$Z$ error $E$ of weight $w < 170$ on a set of qubits $Q$. Let $E'$ be an arbitrary string of $X$ errors supported on $Q$. If we can correct $E$, then $E' \cdot E$ can be corrected. Thus, we have $P_L(w, p_{\rm eff}) = (2/3)^w P_L^Z(w, p_{\rm eff})$ for $w < 170$. Then the logical error probability using the soft-decision decoder can be bounded as follows:


3375 

FiCPeff) = E^ 

W=1 ' ' 

- 1 (T) (W . E (T) (W 

tf;=14\ / ^i;=35 ' ' 

^/'3375\/2\“ 2 X3375-™ ^ V /'3375\ p . . 3375 -™ 

+ I ^ 1 ( 3 ) -Pt («), Peff)Peff(l-Peff) + E I y, IFi (w, Peff)Peff(1 “ Peff) 

<- t (^)”(^t^>f(34,Peff)Peff(l-Peff)— . E ( ^ ) ^ ( (50, Peff )Peff(l - Peff) ^ 

^ /2\“/3375\ „ ,3375-™^ V /'3375\ 

it)=51 ' ' ti;>100 ' ' 


Peff(l-Peff)^ 


(S21) 
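The contributions to such a bound that carry no decoder-dependent factor are plain binomial sums and can be evaluated directly. The sketch below is a rough check under stated assumptions: it reads the bound as containing a sum over $51 \le w \le 100$ weighted by $(2/3)^w$ and an unweighted tail over $w > 100$, takes $p_{\rm eff} = 0.01$, and works in log space with `math.lgamma` to avoid overflow in the binomial coefficients. It does not evaluate the terms that require the simulated quantities $P_L^Z(34, p_{\rm eff})$ and $P_L^Z(50, p_{\rm eff})$.

```python
import math

n = 3375       # physical qubits in the 3-level concatenated [[15,1,3]] code
p_eff = 0.01   # effective error rate considered in the text

def log_binom_pmf(w):
    """log of C(n, w) * p_eff^w * (1 - p_eff)^(n - w)."""
    return (math.lgamma(n + 1) - math.lgamma(w + 1) - math.lgamma(n - w + 1)
            + w * math.log(p_eff) + (n - w) * math.log1p(-p_eff))

# Unweighted binomial tail for w > 100
tail_high = math.fsum(math.exp(log_binom_pmf(w)) for w in range(101, n + 1))

# Sum over 51 <= w <= 100 with the extra (2/3)^w factor
tail_mid = math.fsum(
    math.exp(log_binom_pmf(w) + w * math.log(2 / 3)) for w in range(51, 101)
)
```

With a mean of $3375 \times 0.01 \approx 34$ errors, both sums are far out in the binomial tail, so under this reading the bound is dominated by the two terms that involve the simulated $Z$-error failure probabilities.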


We evaluate $P_L^Z(34, p_{\rm eff})$ and $P_L^Z(50, p_{\rm eff})$ for $p_{\rm eff} = 0.01$ and get an upper bound on the logical error probability of 2 x 10